Taxonomy Induction Using Hierarchical Random Graphs
Abstract
This paper presents a novel approach for inducing lexical taxonomies automatically from text. We recast the learning problem as that of inferring a hierarchy from a graph whose nodes represent taxonomic terms and whose edges denote their degree of relatedness. Our model takes this graph representation as input and fits a taxonomy to it by combining a maximum likelihood approach with a Monte Carlo sampling algorithm. Essentially, the method samples hierarchical structures with probability proportional to the likelihood with which they produce the input graph. We use our model to infer a taxonomy over 541 nouns and show that it outperforms popular flat and hierarchical clustering algorithms.
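The sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes unweighted (binary) edges rather than graded relatedness, follows the general hierarchical random graph formulation in which every internal node of a binary dendrogram carries a connection probability set to its maximum likelihood value, and uses a Metropolis acceptance rule over local subtree rearrangements. The function names and the six example terms are hypothetical.

```python
import math
import random

def leaves(tree):
    """Return the set of leaf labels under a nested-tuple dendrogram."""
    if not isinstance(tree, tuple):
        return {tree}
    return leaves(tree[0]) | leaves(tree[1])

def log_likelihood(tree, edges):
    """Log-likelihood of the graph given the dendrogram, with each internal
    probability fixed at its maximum likelihood value E / (L * R)."""
    if not isinstance(tree, tuple):
        return 0.0
    left, right = leaves(tree[0]), leaves(tree[1])
    e = sum(1 for u in left for v in right if frozenset((u, v)) in edges)
    total = len(left) * len(right)
    p = e / total
    here = 0.0
    if 0 < p < 1:  # p == 0 or p == 1 contributes log(1) = 0
        here = e * math.log(p) + (total - e) * math.log(1 - p)
    return here + log_likelihood(tree[0], edges) + log_likelihood(tree[1], edges)

def inner_paths(tree, path=()):
    """Yield the path (sequence of 0/1 child indices) to every non-root internal node."""
    if isinstance(tree, tuple):
        if path:
            yield path
        yield from inner_paths(tree[0], path + (0,))
        yield from inner_paths(tree[1], path + (1,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    children = list(tree)
    children[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(children)

def propose(tree, rng):
    """Pick a non-root internal node (a, b) with sibling c and swap c
    with one of its two children, chosen at random."""
    path = rng.choice(list(inner_paths(tree)))
    parent_path, side = path[:-1], path[-1]
    parent = get(tree, parent_path)
    a, b = parent[side]
    c = parent[1 - side]
    return replace(tree, parent_path, rng.choice([((a, c), b), ((b, c), a)]))

def sample_taxonomy(nodes, edges, steps=5000, seed=0):
    """Metropolis sampling over dendrograms; returns the best tree visited."""
    rng = random.Random(seed)
    tree = nodes[0]
    for n in nodes[1:]:  # arbitrary starting dendrogram (a chain)
        tree = (tree, n)
    current = log_likelihood(tree, edges)
    best_tree, best = tree, current
    for _ in range(steps):
        candidate = propose(tree, rng)
        score = log_likelihood(candidate, edges)
        # Accept with probability min(1, L_candidate / L_current).
        if score >= current or rng.random() < math.exp(score - current):
            tree, current = candidate, score
            if current > best:
                best_tree, best = tree, current
    return best_tree, best

# Hypothetical toy graph: two groups of mutually related terms.
terms = ["cat", "dog", "lion", "car", "bus", "truck"]
edges = {frozenset(p) for p in [("cat", "dog"), ("cat", "lion"), ("dog", "lion"),
                                ("car", "bus"), ("car", "truck"), ("bus", "truck")]}
tree, score = sample_taxonomy(terms, edges)
print(tree, score)
```

Because trees are visited in proportion to their likelihood, the chain spends most of its time on hierarchies that explain the graph well. Keeping the single best tree, as this sketch does, is only one way to summarize the samples.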
Similar resources
Word Sense Induction & Disambiguation Using Hierarchical Random Graphs
Graph-based methods have gained attention in many areas of Natural Language Processing (NLP) including Word Sense Disambiguation (WSD), text summarization, keyword extraction and others. Most of the work in these areas formulate their problem in a graph-based setting and apply unsupervised graph clustering to obtain a set of clusters. Recent studies suggest that graphs often exhibit a hierarchi...
The ContrastMedium Algorithm: Taxonomy Induction From Noisy Knowledge Graphs With Just A Few Links
In this paper, we present ContrastMedium, an algorithm that transforms noisy semantic networks into full-fledged, clean taxonomies. ContrastMedium is able to identify the embedded taxonomy structure from a noisy knowledge graph without explicit human supervision such as, for instance, a set of manually selected input root and leaf concepts. This is achieved by leveraging structural information ...
Hybrid Architecture for Web Search Systems Based on Hierarchical Taxonomies
Search systems based on hierarchical taxonomies provide a specific type of search functionality that is not provided by conventional search engines. For instance, using a taxonomy, the user can look for documents related to just one of the categories of the taxonomy. This paper describes a hybrid data architecture that improves the performance of restricted searches for a few categories of a ta...
Using Bayesian Classification for Aq-based Learning with Constructive Induction
To obtain potentially interesting patterns and relations from large, distributed, heterogeneous databases, it is essential to employ an intelligent and automated KDD (Knowledge Discovery in Databases) process. One of the most important methodologies is an integration of diverse learning strategies that cooperatively performs a variety of techniques and achieves high quality knowledge. AqBC is a...
TreeMatrix: A Hybrid Visualization of Compound Graphs
We present a hybrid visualization technique for compound graphs (i.e., networks with a hierarchical clustering defined on the nodes) that combines the use of adjacency matrices, node-link and arc diagrams to show the graph, and also combines the use of nested inclusion and icicle diagrams to show the hierarchical clustering. The graph visualized with our technique may have edges that are weight...